Katz back-off is a generative ''n''-gram language model that estimates the conditional probability of a word given its history in the ''n''-gram. It accomplishes this estimation by "backing off" to models with shorter histories under certain conditions. By doing so, the model with the most reliable information about a given history is used to provide the better results.

==The method==

The equation for Katz's back-off model is:〔Katz, S. M. (1987). "Estimation of probabilities from sparse data for the language model component of a speech recognizer". IEEE Transactions on Acoustics, Speech, and Signal Processing, 35(3), 400–401.〕

: <math>
P_{bo}(w_i \mid w_{i-n+1} \cdots w_{i-1}) =
\begin{cases}
d_{w_{i-n+1} \cdots w_i} \dfrac{C(w_{i-n+1} \cdots w_{i-1} w_i)}{C(w_{i-n+1} \cdots w_{i-1})} & \text{if } C(w_{i-n+1} \cdots w_i) > k \\[1ex]
\alpha_{w_{i-n+1} \cdots w_{i-1}} \, P_{bo}(w_i \mid w_{i-n+2} \cdots w_{i-1}) & \text{otherwise}
\end{cases}
</math>

where

: ''C''(''x'') = number of times ''x'' appears in training
: ''w''<sub>''i''</sub> = ''i''th word in the given context

Essentially, this means that if the ''n''-gram has been seen more than ''k'' times in training, the conditional probability of a word given its history is proportional to the maximum likelihood estimate of that ''n''-gram. Otherwise, the conditional probability is equal to the back-off conditional probability of the (''n'' − 1)-gram.

The more difficult part is determining the values for ''k'', ''d'' and ''α''.

''k'' is the least important of the parameters. It is usually chosen to be 0. However, empirical testing may find better values for ''k''.

''d'' is typically the amount of discounting found by Good–Turing estimation. In other words, if Good–Turing estimation adjusts a count ''C'' to an adjusted count ''C''<sup>*</sup>, then

: <math>d = \frac{C^*}{C}</math>

To compute ''α'', it is useful to first define a quantity β, which is the left-over probability mass for the (''n'' − 1)-gram:

: <math>
\beta_{w_{i-n+1} \cdots w_{i-1}} = 1 - \sum_{\{w_i : C(w_{i-n+1} \cdots w_i) > k\}} d_{w_{i-n+1} \cdots w_i} \frac{C(w_{i-n+1} \cdots w_{i-1} w_i)}{C(w_{i-n+1} \cdots w_{i-1})}
</math>

Then the back-off weight, α, is computed as follows:

: <math>
\alpha_{w_{i-n+1} \cdots w_{i-1}} = \frac{\beta_{w_{i-n+1} \cdots w_{i-1}}}{\sum_{\{w_i : C(w_{i-n+1} \cdots w_i) \le k\}} P_{bo}(w_i \mid w_{i-n+2} \cdots w_{i-1})}
</math>

The above formula only applies if there is data for the (''n'' − 1)-gram. If not, the algorithm skips the (''n'' − 1)-gram entirely and uses the Katz estimate for the (''n'' − 2)-gram, and so on, until an ''n''-gram with data is found.
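The following Python sketch illustrates the computation above for the bigram case (backing off to unigram probabilities). It assumes ''k'' = 0 and, for simplicity, a constant discount ratio ''d'' in place of the Good–Turing estimate described in the article; the function names and the toy corpus are purely illustrative, not part of Katz's original formulation.

<pre>
from collections import Counter

def train_counts(tokens):
    """Collect unigram and bigram counts from a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def katz_bigram_prob(word, prev, unigrams, bigrams, k=0, d=0.9):
    """P_bo(word | prev) for a bigram Katz back-off model.

    Bigrams seen more than k times get their maximum likelihood estimate
    scaled by the discount ratio d (a constant here, standing in for the
    Good-Turing discount). The probability mass removed this way (beta) is
    redistributed over the remaining words via the back-off weight alpha,
    in proportion to their unigram (lower-order) probabilities.
    """
    total = sum(unigrams.values())

    def p_unigram(w):
        # lower-order (unigram) maximum likelihood estimate
        return unigrams[w] / total

    c_prev = unigrams[prev]
    if c_prev == 0:
        # no data for the history at all: use the lower-order model directly
        return p_unigram(word)

    seen = {w for w in unigrams if bigrams[(prev, w)] > k}
    # discounted probabilities of continuations observed in training
    p_seen = {w: d * bigrams[(prev, w)] / c_prev for w in seen}

    beta = 1.0 - sum(p_seen.values())                      # left-over mass
    denom = sum(p_unigram(w) for w in unigrams if w not in seen)
    alpha = beta / denom if denom > 0 else 0.0             # back-off weight

    return p_seen[word] if word in seen else alpha * p_unigram(word)

# Toy usage: probabilities over the vocabulary given the history "the"
corpus = "the cat sat on the mat the cat ate".split()
uni, bi = train_counts(corpus)
for w in uni:
    print(w, katz_bigram_prob(w, "the", uni, bi))
</pre>

On this toy corpus the probabilities given the history "the" sum to 1: the seen continuations ("cat", "mat") keep 90% of the mass, and the remaining 10% (β) is spread over the unseen words in proportion to their unigram probabilities.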